Finding Regional Co-location Patterns for Sets of Continuous Variables

Authors

  • Christoph F. Eick
  • Rachana Parmar
  • Wei Ding
  • Tomasz F. Stepinski
  • Jean-Philippe Nicot
Abstract

This paper proposes a novel framework for mining regional co-location patterns with respect to sets of continuous variables in spatial datasets. The goal is to identify regions in which multiple continuous variables with values from the wings (tails) of their statistical distributions are co-located. The framework operates in the continuous domain without the need for discretization, and it treats regional co-location mining as a clustering problem in which an externally given fitness function has to be maximized. The interestingness of co-location patterns is assessed using products of z-scores of the relevant continuous variables. The proposed framework is evaluated by a domain expert in a case study that analyzes chemical concentrations in Texas water wells, centering on co-location patterns involving arsenic. The approach identified both known and previously unknown regional co-location patterns, and different sets of algorithm parameters led to characterizations of the arsenic distribution at different scales. Moreover, inconsistent co-location sets were found for regions in South Texas and West Texas that can be clearly attributed to geological differences between the two regions, emphasizing the need for regional co-location mining techniques. In addition, a novel prototype-based region discovery algorithm named CLEVER is introduced that uses randomized hill climbing and searches over a variable number of clusters and larger neighborhood sizes.
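The abstract states only that interestingness is assessed using products of z-scores; it does not give the exact formula. As a rough illustration, the sketch below shows one plausible reading: standardize each variable over the whole dataset, then average, over the objects in a candidate region, the product of the z-scores in the direction of interest. The function names, the clipping of each factor at zero, the averaging over the region, and the sample values are all assumptions made for this example and are not taken from the paper.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values using the mean and population std of the whole dataset."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pattern_interestingness(data, region, pattern):
    """Score a region for a co-location pattern (hypothetical formulation).

    data:    dict mapping variable name -> list of values, one per object
    region:  list of object indices that belong to the candidate region
    pattern: dict mapping variable name -> +1 (high wing) or -1 (low wing)

    Returns the mean, over the region's objects, of the product of the
    directed z-scores, with each factor clipped at zero (an assumption).
    """
    if not region:
        return 0.0
    z = {var: z_scores(data[var]) for var in pattern}
    total = 0.0
    for i in region:
        product = 1.0
        for var, direction in pattern.items():
            # Only values in the requested wing contribute a positive factor.
            product *= max(0.0, direction * z[var][i])
        total += product
    return total / len(region)


# Made-up concentrations for four wells (illustrative values only).
data = {
    "arsenic": [1.0, 2.0, 9.0, 10.0],
    "fluoride": [1.0, 1.0, 5.0, 5.0],
}
high_both = {"arsenic": +1, "fluoride": +1}

score_high_region = pattern_interestingness(data, [2, 3], high_both)
score_low_region = pattern_interestingness(data, [0, 1], high_both)
score_all = pattern_interestingness(data, [0, 1, 2, 3], high_both)
```

Under these assumptions, a region containing only the wells with jointly high values scores higher than the dataset as a whole, which is the kind of regional signal a fitness-maximizing clustering search could exploit.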

Related Resources

Discovering Regional Co-location Patterns for Sets of Continuous Variables in Spatial Datasets

This paper proposes a novel framework for mining regional co-location patterns with respect to sets of continuous variables in spatial datasets. The goal is to identify regions in which multiple continuous variables with values from the wings of their statistical distribution are co-located. A co-location mining framework is introduced that operates in the continuous domain without and which vi...

New Regional Co-location Pattern Mining Method Using Fuzzy Definition of Neighborhood

Regional co-location patterns represent subsets of object types that are located together in space (i.e. region). Discovering regional spatial co-location patterns is an important problem with many application domains. There are different methods in this field but they encounter a big problem: finding a unique optimum neighborhood radius or finding an optimum k value for nearest neighbor featur...

Selection of Variables that Influence Drug Injection in Prison: Comparison of Methods with Multiple Imputed Data Sets

Background: Prisoners, compared to the general population, are at greater risk of infection. Drug injection is the main route of HIV transmission, in particular in Iran. What would be of interest is to determine variables that govern drug injection among prisoners. However, one of the issues that challenge model building is incomplete national data sets. In this paper, we addressed the process ...

Mining Of Spatial Co-location Pattern from Spatial Datasets

Spatial data mining, or knowledge discovery in spatial database, refers to the extraction of implicit knowledge, spatial relations, or other patterns not explicitly stored in spatial databases. Spatial data mining is the process of discovering interesting characteristics and patterns that may implicitly exist in spatial database. A huge amount of spatial data and newly emerging concept of Spati...

Event Centric Modeling Approach in Colocation Pattern Analysis from Spatial Data

Spatial co-location patterns are the subsets of Boolean spatial features whose instances are often located in close geographic proximity. Co-location rules can be identified by spatial statistics or data mining approaches. In data mining method, Association rule-based approaches can be used which are further divided into transaction-based approaches and distance-based approaches. Transaction-ba...


Publication date: 2007